Cloning method and kit

ABSTRACT

A method of amplifying target DNA is disclosed wherein said DNA is first amplified by PCR, the amplified DNA then being contacted with a single stranded linearised plasmid vector having terminal regions which are complementary to terminal regions of the PCR amplified DNA, whereby a cyclic product is formed comprising single stranded sequences from said target DNA and said vector and two double stranded regions from the overlapping terminal regions of the vector and the PCR amplified DNA; the cyclic product then being introduced into a host organism. Two-stage PCR may be performed and site-specific mutagenesis may be effected between PCR amplification and formation of the cyclic product. The single-stranded linearised plasmid vector and/or the target DNA may be immobilised. Kits are disclosed for performing various aspects of the method which can be used in a method of diagnosis wherein the target DNA is characteristic of a physiological condition.

This invention relates to a cloning method and to a kit for performing same.

The ability to splice genes, gene fragments or other target DNA into vectors or other pieces of DNA using restriction enzymes (RE) and ligases has been an important aspect in the advance of molecular biology and biotechnology. However, the technology of recombination or gene splicing has several disadvantages. Firstly, there is the need for conveniently situated restriction sites and often sites have to be constructed, which not only takes time but can lead, in the long term, to mismatched reading frames and, for example, non-expression of a gene of interest. Such sites are usually introduced by means of oligonucleotide linkers which have to be synthesised and purified and are then used in excess to ensure the addition of the required RE site(s) on to the target DNA. Secondly, the in vitro ligation or splicing of DNA is rather inefficient and relatively cumbersome screening techniques are required to locate desired recombinants. Finally, the technology is time-consuming and is not well-suited to automation. Accordingly, there is a need for a simple and relatively rapid method of cloning which avoids the problems of conventional splicing and the use of conventional plasmids.

It should be noted that conventional plasmids for cloning normally take the form of double stranded cyclic plasmid structures containing a promoter region separated from a gene or other DNA sequence of interest for replication or expression by one or more RE sites which permit the DNA of interest to be excised subsequently; such sites are also used for insertion of the DNA of interest for replication or expression, via one or more (RE) sites in the linearised plasmid which permit the introduction of DNA of interest, which is provided with `sticky ends` corresponding to RE sites of the linearised plasmid. When DNA has been synthesised, for example by cDNA synthesis from mRNA, by mutagenesis or by chemical synthesis, it is in single stranded form which is then conventionally treated with a polymerase to synthesise the second strand, provided with the required `sticky ends`, inserted into the double stranded plasmid vector and ligated to join covalently the insert to the vector which is then used to transform a host microorganism, e.g. E. coli.

In recent years, the polymerase chain reaction (PCR) has been used for the amplification of target DNA. While this produces increased amounts of the DNA, it is often required to produce larger quantities by cloning in a suitable vector using a host microorganism such as E. coli. Furthermore, for production of the corresponding protein it is required to incorporate the DNA into an expression vector.

For the reasons given above, conventional techniques for splicing the target gene into plasmid vectors are time consuming and inefficient and not well suited to automation. In cases where PCR itself is effected by an automated technique, it would be desirable for incorporation into the vector also to be readily added on to the automated system.

The present invention has as an object a method which provides for the formation of recombinant DNA from PCR amplified DNA without the need for restriction enzymes or ligases or the provision of restriction sites.

Accordingly, the present invention provides, in one aspect thereof, a method of amplifying target DNA wherein said DNA is first amplified by PCR, the amplified DNA then being contacted with a single stranded linearised plasmid vector having terminal regions which are complementary to terminal regions of the PCR amplified DNA, whereby a cyclic product is formed comprising single stranded sequences from said target DNA and said vector and two double stranded regions from the overlapping terminal regions of the vector and the PCR amplified DNA; the cyclic product then being introduced into a host organism.

It is surprising that the cyclic product can be used to transform a host directly; the native enzyme system of the host organism is capable of chain extension to complete synthesis of the double stranded plasmid which is then available for replication and/or expression of the DNA of interest.

An advantage of the method according to the invention over conventional PCR is that the target DNA is first amplified by PCR sufficiently to give enough DNA for practical purposes of transformation of a host. The host cell replicates the target DNA quite rapidly but highly conservatively, and without using expensive chemicals such as nucleoside triphosphates. The conservation in amplification is important since conventional PCR is known to suffer from errors introduced by mis-matched codons. Not only are such errors amplified during each cycle of PCR but more errors are created in each cycle and this creates a high background level of contamination. Cloning in a host organism can be used to detect errors and select only accurately amplified DNA.

The complementary regions of the PCR amplified DNA may be present in the target DNA but advantageously they are provided as single-stranded nucleotide extensions on the PCR primers, which extensions or `handles` do not bind to the target DNA (as described in co-pending International patent application PCT/EP90/00454).

Preferably the PCR amplification step of the method according to the invention involves nested primers, as described in the above co-pending PCT application, and this leads to greater sensitivity in isolating and amplifying the target DNA.

Advantageously, the terminal overlapping regions are sufficiently large to provide an adequate hybridisation overlap between the PCR amplified DNA and the terminal regions of the linearised single stranded plasmid so as to form a stable cyclic product, yet still reasonably short in order to avoid unnecessary chemical synthesis, if using primer extensions. It will be clear to persons skilled in the art that the size and stability of the overlapping regions will be dependent to some degree upon the ratio of A-T to C-G base pairings since more hydrogen bonding is available in a C-G pairing. Also, it will be apparent that the smaller the overlap the more likely there will be complementarity with a non-terminal region and that if the terminal regions get too large there is always the possibility that the single strand will fold back on itself and hybridise to give a hairpin or dumbbell structure. It is preferred that the overlap should be at least a ten base pair overlap, more preferably at least a twenty base pair overlap.
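The dependence of overlap stability on base composition can be illustrated with a short Python sketch. It computes the G+C fraction of a candidate overlap and a rough melting temperature using the common "2 + 4" rule of thumb (2° C. per A-T pair, 4° C. per G-C pair). That rule is not part of this disclosure, is only an approximation intended for short oligonucleotides, and is used here purely for illustration; the function names are hypothetical. Primer A of Example 1 serves as sample input.

```python
# Illustrative sketch only: the "2 + 4" rule is a textbook approximation,
# not a method taken from this specification.

PRIMER_A = "TGCTTCCGGCTCGTATGTTGTGTG"  # primer A (SEQ ID No: 1), hyphens removed


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rough_tm(seq: str) -> int:
    """Approximate melting temperature in deg C: 2 per A/T, 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print(f"length      : {len(PRIMER_A)} nt")
    print(f"G+C fraction: {gc_fraction(PRIMER_A):.2f}")
    print(f"rough Tm    : {rough_tm(PRIMER_A)} deg C")
```

A G-C rich overlap of a given length scores higher than an A-T rich one, which is consistent with the observation above that C-G pairings contribute more hydrogen bonding.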

The term "cloning vector" as used herein includes plasmid vectors both for simple replication and for expression. A replication vector will contain an origin of replication and usually a marker e.g. an antibiotic resistance marker, to aid recognition. An expression vector will normally contain promoter and initiator sequences which must be operably connected in the same reading frame as the DNA insert if this is to be expressed correctly, as well as operator and expression control sequences and a ribosome binding site; appropriate markers e.g. antibiotic resistance markers, are usefully present. In both cases, appropriate RE sites for excision of the DNA will be desirable, especially in the replication vector.

As indicated above, the cyclic product, which is essentially single stranded apart from the two overlapping regions, may surprisingly be used after annealing to transform a host microorganism, thus avoiding the steps of second strand synthesis, and modification and ligation. Since such DNA is synthesised chemically in single stranded form, the simplified procedure of the invention lends itself to the rapid cloning of DNA so synthesised. Also, the method according to the invention is well suited to automation since no steps of precipitation, extraction, filtration, centrifugation or treatment with enzymes are required in getting the PCR-amplified DNA into a host cell.

The linearised single stranded vector may conveniently be a standard vector the terminal sequences of which comprise one or more RE sites permitting a variety of restriction endonucleases to be used to excise the DNA of interest after replication.

Most cloning vectors now in use have a common ancestry, e.g. pUC18, and include the so-called multiple cloning site including several RE sites flanked by longer regions which are also identical. In the case of pUC18, the flanking regions are part of the E. coli Lac Z gene. It may be convenient to include the multiple cloning site or at least one RE site thereof with the PCR amplified DNA insert and to use the two flanking regions as the overlapping sequences in accordance with the invention. It is thus convenient to provide the DNA insert with terminal regions complementary to such standard sequences. The term "complementary" as used above means that the regions hybridise in the correct orientation to form the required cyclic product in which the overlapping 3' ends can serve as primers for synthesis of the remainder of the complete double stranded vector by the host organism.

The PCR-amplified target DNA can be cloned into different vectors provided that complementary overlap regions exist between the vector and the DNA. This is significant, for example, where it is desired to insert a gene fragment into many different vectors, such as expression vectors.

In general, to ensure adequate hybridisation and stability of the cyclic product, the overlapping regions are preferably 20 to 250 base pairs (bp) in length or even longer (e.g. 500 bp), more preferably 40 to 200 bp. However, if the overlapping regions are too long, the length of the region to be amplified may be limited in view of the fact that PCR is most effective in the region of 500 to 2,000 bp.

The hybridisation reaction is preferably effected in a 1M sodium chloride solution or an equivalent solution known in the art (Nucleic Acid Hybridisation, B D Hames and S J Higgins, IRL Press, 1985).

In the PCR stage, the unamplified target dsDNA is first denatured and primers are annealed to both the coding and the non-coding strand. The primers are preferably those corresponding to the 5'-terminal sequences of the DNA so that on extension of the primer with a polymerase, the whole target DNA sequence of each strand will be replicated. The double stranded DNA so produced is then denatured by raising the temperature followed by rapid cooling. An excess of the primer molecules is present and these are annealed to the newly formed coding and non-coding strands. Extension using polymerase produces further double stranded DNA. The temperature cycling can be repeated many times, thereby producing a large number of copies of the DNA. Preferably, the polymerase used is one which can withstand the highest temperature of the cycle, commonly the Taq polymerase, otherwise there is a need to separate the polymerase from the nucleic acids before each heating step or replenish the polymerase after each cooling step. It is also preferred that the polymerase has a high proof-reading ability to avoid mis-matched bases and randomly introduced errors. An example of such a polymerase is Vent polymerase available from New England Biolabs. Such PCR amplification provides target DNA incorporating the primers which are used. As mentioned above, nested primers may be used. In this case PCR is carried out with a first set of primers for a given number of cycles e.g. about 25. The amplified DNA is then contacted with a second pair of primers, one or both being different from the primers used earlier and being inboard of the binding sites of the first primers.
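The scale of amplification obtained by repeated temperature cycling can be estimated with a simple model. The sketch below assumes idealised exponential growth, in which each cycle multiplies the number of target molecules by (1 + efficiency); the function name and the efficiency parameter are illustrative assumptions, and real reactions plateau well below the ideal figure.

```python
# Idealised model of PCR yield; the efficiency parameter is an assumption
# introduced for illustration and is not specified in the text.

def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds, each multiplying by (1 + efficiency)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting molecule, 25 cycles at perfect doubling: 2**25 copies.
    print(f"{pcr_copies(1, 25):.3g}")
    # The same run at a hypothetical 80% per-cycle efficiency.
    print(f"{pcr_copies(1, 25, efficiency=0.8):.3g}")
```

With perfect doubling, 25 cycles give 2^25 (about 3.4 × 10^7) copies per starting molecule, which indicates why a first round of about 25 cycles followed by a second, nested round can yield ample material for transformation.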

Since the method of the invention uses single stranded amplified DNA, it is advantageous for one PCR primer to carry means for immobilisation, e.g. a biotin molecule, or to be already attached to a support.

The double stranded amplified DNA may then be subjected to strand separation whereby one strand (unwanted) remains immobilised while the other is liberated into solution and can be combined with the linearised vector in accordance with the invention. Thus, such strand separation after PCR is an important preferred aspect of the invention.

However, since the linearised standard vector will hybridise only to one of the two PCR amplified DNA strands, it is also possible to liberate both strands into solution by conventional strand separation and to react these directly with the linearised standard vector. This will, however, be less efficient due to competing re-assembly of double stranded target DNA.

The PCR stage of the invention may also include a subsequent step of site-specific mutagenesis of the target DNA. In one such strategy, a standard linearised single stranded vector can be prepared by taking a plasmid in double stranded form containing two outer RE sites and two further inner RE sites inside these, each separated from the outer RE sites. The plasmid is cut at one of the inner RE sites and biotinylated followed by restriction at the other inner RE site. This provides a linearised double stranded vector which is then attached to an insoluble support coated with avidin or streptavidin. One strand of the linearised vector is thus attached to the support while the other is not and the latter can then be brought into solution by denaturation. In this example, a further plasmid contains the DNA sequence to be mutagenised flanked by sequences corresponding to the terminal regions of the single stranded linearised vector. For example the further plasmid may be the standard vector having the DNA sequence to be mutagenised inserted between the two inner RE sites. The further plasmid is subjected to at least one or two cycles of PCR amplification using primers flanking the target DNA sequence (to be mutagenised), these primers being homologous with the terminal sequences of the linearised vector. For example, the primers may correspond to the sequences between the outer and inner RE sites of the standard vector. One of the primers is provided with means for attachment to a support (e.g. a biotin group) or is already attached to the support. Chain extension provides, after a final strand separation, the target DNA in single stranded form linked at one end to a support. Hybridisation to a further primer at the 3'-end to initiate chain extension and a mutagenesis primer incorporating the desired mutation, permits synthesis, in the presence of a polymerase, of a DNA strand incorporating the mutation and flanked by terminal sequences complementary to those of the linearised vector. Strand separation, e.g. by treatment with alkali, liberates the mutagenised strand into solution while the template is immobilised and thus readily separated. The mutagenised ss DNA may then be contacted with the linearised vector and annealed to give a cyclic product in accordance with the invention.

The target DNA may be cDNA produced by reverse transcription from mRNA, and the method according to the invention therefore provides a way of directly cloning cDNA. For example, mRNA may be immobilized on a solid support bearing poly dT which hybridises to the poly A tails of the mRNA.

Preferably, the connection of the poly dT to the solid support includes a suitable RE site. Reverse transcription can then be effected advantageously using the poly dT as a primer. The mRNA is then removed leaving the newly synthesised single-stranded cDNA attached to the solid support.

The single-stranded cDNA may be made double-stranded by use of a suitable polymerase, e.g. T4 polymerase, and the free end of the cDNA may have attached thereto a linker using a suitable ligase. In this case the linker and the sequence proximate the solid support are, advantageously, complementary to the terminal regions of the plasmid vector. The double-stranded cDNA can then be subjected to PCR in accordance with the invention, the PCR primers corresponding to the linker sequence and the sequence proximate the solid support. Alternatively, the linker and/or sequence proximate the solid support may not be complementary with the terminal regions of the vector in which case such complementary regions can be provided by using PCR primers with handles.

Instead of forming double-stranded cDNA as mentioned above, it is possible to use a terminal transferase to add several molecules of one type of nucleotide to the 3' end of the single-stranded cDNA to form a tail, for example a dG tail. Thus the single-stranded cDNA comprises a 5' poly dT sequence near the support (which sequence hybridised to the poly A tail of the mRNA) and a 3' tail, for example poly dG. PCR can now be initiated using poly dT and poly dC primers. The linearised single-stranded vector preferably has complementary terminal poly dA and poly dC regions in order to form the cyclic product with the PCR amplified single-stranded cDNA. As mentioned above, it is of course possible to use primers in the PCR amplification which comprise either poly dT or poly dC and appropriate handles to form the overlap with complementary regions of the vector.

The method according to the invention in combination with direct solid phase DNA sequencing can be used to analyse target DNA, for example alleles of a locus e.g. the human apolipoprotein E locus, and may thus be used for diagnosis of physiological conditions. Such analysis may include sequencing, as will be exemplified below. Moreover, the method according to the invention can be used in a diagnostic method, for example testing for the presence of a certain allele. For example, the cloning may be followed by direct sequencing to separate and identify both alleles in a heterozygote. Such direct clinical sequencing to detect polymorphism has the advantages that non-expected nucleotide changes in close proximity to the allele analysed will be detected and that flanking sequences can be used as positive controls in order to verify that the non-expected exchange was not due to the earlier PCR amplification.

Clearly, in such methods as above where one is investigating genomic DNA, overlapping terminal regions are provided by the primers during amplification; there is no need to provide specific terminal RE sites to allow incorporation of the target DNA into a vector (as required in conventional cloning and amplification protocols). However, it is preferable that RE sites are incorporated adjacent the target DNA since such RE sites will allow for subsequent excision of the cloned target DNA from the vector. RE sites can be conveniently provided in the overlapping terminal regions provided by the primers during amplification. If, for example, the vector includes a multiple cloning site having many RE sites, one site can be selected for restriction and the overlapping regions of the primers will then be complementary to the terminal regions either side of the selected restriction site.

It should be noted that it is preferable to remove any excess primer remaining after PCR amplification since otherwise it will compete with the terminal regions of the plasmid to hybridise with the amplified DNA.

It will be appreciated that in any of the above systems, the biotin/avidin or streptavidin affinity coupling may be replaced by other such coupling using a relatively small molecule and binding partner, e.g. an antigen and antibody therefor, or covalent coupling as indicated below.

An advantage in PCR strategies involving immobilised site-specific mutagenesis is that the template is readily removed completely from the synthesised DNA, thus avoiding contamination with unmutated DNA.

As mentioned earlier, it may be convenient in some of the above PCR strategies to use PCR primers having `handles` of DNA not hybridising in the first cycle of the PCR amplification, such handles corresponding to the terminal regions of the standard vector while the hybridising regions of the primers correspond to regions of a source of target DNA e.g. genomic DNA. This applies equally to PCR amplification with or without mutagenesis.
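How `handles` end up as the termini of the amplified DNA can be sketched in Python. The example assumes each primer is a 5' handle followed by a 3' target-hybridising region, and it predicts the top strand of the final amplicon. All sequences and names below are invented toy values for illustration, not sequences from this specification.

```python
# Toy illustration of primers carrying 5' handles. All sequences are
# hypothetical and far shorter than real primers.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def handled_amplicon(template: str, fwd_anneal: str, rev_anneal: str,
                     fwd_handle: str, rev_handle: str) -> str:
    """Top strand of the product made with handle-bearing primers.

    fwd_anneal matches the top strand; rev_anneal matches the bottom strand.
    The handles do not need to match the template at all.
    """
    start = template.find(fwd_anneal)
    end = template.find(revcomp(rev_anneal))
    if start < 0 or end < start:
        raise ValueError("primers do not flank a region of the template")
    core = template[start:end + len(rev_anneal)]
    return fwd_handle + core + revcomp(rev_handle)


if __name__ == "__main__":
    template = "GGGGATGCCGTACCCCTTTTGATTACAGGGGG"
    product = handled_amplicon(
        template,
        fwd_anneal="ATGCCGTA",
        rev_anneal=revcomp("GATTACAG"),
        fwd_handle="CAGTCAGT",   # hypothetical vector-derived handle
        rev_handle="AACCGGTA",   # hypothetical vector-derived handle
    )
    print(product)
```

In the first cycle only the hybridising regions pair with the source DNA; from the second cycle onwards the handles are copied, so that every later product carries both vector-derived termini.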

The insoluble support, where used, may take a variety of forms, for example microtitre wells, filters made from materials such as cellulose or nylon, or particles including, for example, sephadex or sepharose beads or polystyrene latex particles. It is a preferred feature of the invention to use magnetic particles which may be aggregated onto a surface and then be readily redispersed for a subsequent treatment step, e.g. by physical agitation.

Probe and primer oligonucleotides may be prepared by using any of the commercially available DNA synthesis devices, e.g. those available from Applied Biosystems, Inc. (850-T Lincoln Center Drive, Foster City, Calif. 94404).

Some aspects of the process of the invention are in part disclosed in our International Patent Application No. PCT/EP89/01417, the contents of which are incorporated herein by reference.

The invention also includes kits for carrying out the cloning procedure of the invention comprising one or more of the following:

a) a standard linearised vector in single stranded or double stranded form, the said double stranded form being immobilised by one end of one strand thereof.

b) an insoluble support carrying one member of a pair of binding partners.

c) nucleotides carrying the other member of said pair of binding partners.

d) a polymerase.

e) 2 PCR primers corresponding to the terminal regions of said standard vector, one of which is adapted to bind to said support.

f) a thermostable polymerase.

g) an alkaline solution for strand separation.

The invention will now be described by way of non-limiting examples with reference to the drawings in which:

FIG. 1 shows a protocol for site-specific mutagenesis using the method according to the invention;

FIG. 2 shows a protocol for amplification of genomic DNA using the method according to the invention;

FIG. 3 (SEQ ID Nos. 6 and 7) shows a region of a human apolipoprotein E gene together with primers; and

FIG. 4A and B show sequencing printouts for two clones.

EXAMPLE 1

In vitro mutagenesis on latex particles.

The protocol shown in FIG. 1 was used.

(a) To yield the ss vector template, 10 μg of pUC18 were digested with EcoRI in a total volume of 50 μl. The 5' protruding ends were filled in using Klenow polymerase (5 U) and 2 μl Biotin-7-dATP (BRL), 7.5 μl of a buffer containing 100 mM Tris-HCl (pH 7.5), 100 mM MgCl₂ and 1M NaCl. The volume was adjusted to 75 μl with water. The reaction was performed at room temperature for one hour and the product was then purified using a sephadex G50 spin column. The purified biotinylated linearized vector was cut with HindIII. The reaction containing the biotinylated double stranded DNA was mixed with previously washed Pandex avidin particles (Baxter Healthcare Corp., Mundelein, Ill., USA).

To yield the ss vector template the immobilized double stranded DNA was converted into single stranded form by melting off the non-attached strand by incubation at 37° C. with 20 μl 0.15M NaOH for 10 minutes. The pH of the supernatant was immediately adjusted with 1.5 μl HAc (1.7M) and 2.2 μl 10×TE (100 mM Tris pH 7.5, 10 mM EDTA).

(b) To yield the mutagenesis template, the inserted fragment from pUCRA was PCR amplified using 10 pmol of primer A (SEQ ID No.: 1) TGC-TTC-CGG-CTC-GTA-TGT-TGT-GTG3' and biotinylated primer B (SEQ ID No.: 2) Biotin-AAA-GGG-GGA-TGT-GCT-GCA-AGG-CGA3' in 25 μl PCR reaction mixture as recommended by Perkin Elmer and amplified for 20 cycles. After PCR amplification, Pandex avidin particles were added to immobilize the amplified insert with flanking vector sequences.

(c) To yield template for in vitro mutagenesis the immobilized PCR amplified fragment was made single stranded with 0.15M NaOH for 10 minutes at room temperature. 10 pmol each of primer Q (SEQ ID No.: 3) 5'CGG-CTC-GTA-TGT-GTG-GAA-TTG and mutagenesis primer M (SEQ ID No.: 4) 5'CC-AAT-GCA-TAT-GTG-GTC-GGC-TAC-GCT-GGA-AAT-AGC-GCA-TAT-TTC3' (the original sequence (SEQ ID No. 5) was: CCAAT GCA-TAT-GTG-GTC-GGC-TACCGT GCT GGA AAT AGC-GCA) were annealed to the template immobilized on the Pandex avidin particles in a solution containing 10 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 100 μg/ml BSA and 100 mM NaCl. The mixture was incubated at 65° C. for a few minutes and allowed to cool to room temperature.
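The change introduced by mutagenesis primer M can be checked by comparing it with the original sequence quoted above. The following Python sketch, which assumes the two sequences exactly as transcribed in this example (hyphens and spaces removed), locates a single short deletion by finding the longest common prefix and then testing how many bases must be skipped in the original for the remainder to match. The helper name is hypothetical.

```python
# Sequences as transcribed in step (c), with separators removed.
ORIGINAL = "CCAATGCATATGTGGTCGGCTACCGTGCTGGAAATAGCGCA"      # SEQ ID No. 5
PRIMER_M = "CCAATGCATATGTGGTCGGCTACGCTGGAAATAGCGCATATTTC"   # SEQ ID No. 4


def find_deletion(original: str, mutant: str, max_len: int = 6):
    """Return (position, deleted_bases) for one short deletion, else None."""
    prefix = 0
    limit = min(len(original), len(mutant))
    while prefix < limit and original[prefix] == mutant[prefix]:
        prefix += 1
    for k in range(1, max_len + 1):
        rest = original[prefix + k:]
        # The mutant may extend beyond the quoted original, so compare
        # only over the length of the remaining original sequence.
        if rest and mutant[prefix:prefix + len(rest)] == rest:
            return prefix, original[prefix:prefix + k]
    return None


if __name__ == "__main__":
    print(find_deletion(ORIGINAL, PRIMER_M))
```

On these inputs the comparison reports a three-base deletion (CGT) after the 23rd base, i.e. primer M removes one codon from the original sequence while keeping the reading frame; the primer also extends six bases beyond the quoted original fragment.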

(d) Extension was performed by adding 1 μl BSA (100 μg/ml), 6 μl polymerase mix (100 mM Tris-HCl pH 8.8, 10 mM DTT, 50 mM MgCl₂ and 5 mM ATP), 6 μl chase (10 mM each of dNTP) and 3.5 U T4 DNA polymerase. 1 unit of T4 DNA ligase was added to the beads containing the insert strand.

The volume was adjusted to 30 μl with water. The mixture was incubated at RT for 20 minutes followed by incubation on a roller mixer at 37° C. for two hours.

(e) After extension the beads were washed once with TE. The newly synthesized strands were melted off by incubation with 20 μl 0.15M NaOH at 37° C. for 10 minutes. The pH of the supernatant was immediately adjusted with 1.5 μl HAc (1.7M) and 2.2 μl 2×TE.

The two supernatants, the single stranded vector and the newly mutated insert with flanking vector sequences, were mixed and incubated at 70° C. for 10 minutes and allowed to cool to RT.

After annealing of the two strands the CaCl₂ concentration was adjusted to 0.1M and E. coli DH5α was transformed with the DNA and spread on TRAB plates containing IPTG and X-Gal.

EXAMPLE 2

Direct cloning of the human genomic apolipoprotein E gene using magnetic separation of single stranded DNA.

Materials and Methods

Clinical samples

Leukocytic DNA from venous blood samples from patients having the genotype E2/4 was kindly provided by A.-C. Syvanen and K. Aalta Setala (1) (Orion Corporation, Helsinki, Finland).

Preparation of the primers

Four PCR primers (denoted RIT113, 114, 123 and 125) were synthesized on an Applied Biosystems 381A DNA synthesizer. 5'-amino modified oligonucleotides (RIT123) were synthesized using the reagent Aminolink 2 (ABI, USA). A biotin residue was attached to the amino group using the reagent Biotin-X-NHS ester as described by the manufacturer (Clontech, USA) and the biotinylated oligonucleotide was purified by reverse phase HPLC. The fluorescent M13 universal sequencing primer was purchased from Pharmacia LKB Biotechnology, Sweden. The two primers RIT123 and RIT125 contained a 5' sequence (22 nucleotides) complementary to the pUC vector.

Polymerase chain reaction

The DNA (100 ng per sample) was amplified with the RIT113 and RIT114 primers at a final concentration of 1 μM. The PCR was carried out in 100 μl of a solution consisting of 0.2 mM each of the dNTPs, 20 mM Tris-HCl, pH 8.8, 15 mM (NH₄)₂SO₄, 1.5 mM MgCl₂, 0.1% Tween 20, 0.1 mg/ml gelatin and 2.5 units of Taq polymerase (United States Biochemical Corp, USA) in a thermal cycler (Perkin-Elmer, USA) for 25 cycles of 1 min. at 96° C. and 2 min. at 65° C. For amplification with a second pair of primers, a small aliquot (3 μl of a 1:100 dilution) of the PCR product amplified with the primers RIT113 and RIT114 was transferred to a second PCR. This was carried out under the conditions described above using the biotinylated primer RIT123 and the primer RIT125 at 0.1 μM concentration.

Immobilization on magnetic beads

Magnetic beads containing covalently attached streptavidin, Dynabeads® M280 Streptavidin, were obtained from Dynal (N-0212 Oslo 2, Norway). A neodymium-iron-boron permanent magnet, MPC-E (Dynal, Norway), was used to sediment the beads in the tubes during supernatant removal and washing procedures. The PCR mixture was added to 300 μg of Dynabeads® M280 Streptavidin previously washed with TE buffer (10 mM Tris pH 7.5, 1 mM EDTA) containing 1M NaCl, and incubated 15 min. at room temperature.

Preparation of single stranded vector

A total of 5 μg of pUC18 (Pharmacia LKB Biotechnology, Sweden) was digested with EcoRI and phenol extracted, followed by desalting with a sephadex G50 spin column. Biotin-7-dATP (BRL, USA) was introduced by Klenow polymerase. After heat inactivating and desalting it was digested with HindIII and immobilized on 1 mg of Dynabeads® M280 Streptavidin. Strands were separated using 40 μl 0.12M NaOH. The supernatant was neutralized by adding 3.7 μl 1.7M HAc and 4.4 μl 10×TE buffer pH 7.5. The concentration was estimated using agarose gel electrophoresis.

Direct cloning

The PCR amplified apoE gene region containing ends complementary to the single stranded vector was immobilized on magnetic beads. The strands were separated using 40 μl 0.12M NaOH. The supernatant was neutralized by adding 3.7 μl 1.7M HAc and 4.4 μl 10×TE buffer pH 7.5 and the concentration was estimated. 100 ng single stranded vector was mixed with an equal amount of single stranded genomic amplified DNA in a total of 5 μl. Transformation of competent E. coli DH5α cells (BRL, USA) was performed according to the manufacturer's directions.

Sequencing reactions

The immobilized dsDNA was washed with 50 μl TE buffer and then incubated with 10 μl 0.1M NaOH for 15 min. at room temperature. The supernatant was removed and the beads containing the immobilized single stranded DNA were washed with 50 μl 0.15M NaOH and 3 times with 50 μl TE buffer. The volume was adjusted to 13 μl with H₂O.

All sequencing reactions were performed with reagents from the AutoRead sequencing kit (Pharmacia LKB Biotechnology, Sweden). 2 μl (1 pmol) of a fluorescent labelled universal sequencing primer was added to each Eppendorf tube together with 2 μl of annealing buffer. The annealing mixtures were incubated at 65° C. for 15 min. and allowed to cool to room temperature for 15 min. 1 μl of a MID solution (Manganese, Isocitrate and DTT) was added to each annealing mixture together with 2 μl T7 polymerase (1.5 units/μl). 2.5 μl of each of the A, C, G and T sequencing mixtures (containing c⁷dGTP instead of dGTP) were prewarmed at 37° C. for 1 min. using a microtest plate (Sarstedt, West Germany) before 4.5 μl of the annealing mixture was added to each sequencing mixture and incubated for 5 min. at 37° C. 5 μl of deionized formamide containing Blue Dextran was added to stop the reactions. The microtest plate was heated at 80° C. for 2 min. and 5 μl was loaded on a 6% sequencing gel run on an automatic sequencing apparatus with detection of fluorescent bands during electrophoresis (A.L.F, Pharmacia LKB Biotechnology, Sweden).

RESULTS

The principle for solid phase cloning

A basic concept for cloning using a solid phase approach is shown in FIG. 2. A single stranded vector fragment is provided by selectively incorporating biotin into one of the strands of the vector DNA. This is achieved by restriction and fill-in using a biotin-dNTP and DNA polymerase. The double stranded DNA is bound to magnetic beads containing streptavidin and the single stranded vector is simply eluted with alkali. If more single stranded vector fragment is needed, a run-off extension reaction with DNA polymerase can be carried out, one or several times, and the extended fragment is again eluted with alkali and collected. This yields a well defined single stranded vector with flanking sequences represented by the A and B region (FIG. 2).

Alternatively, the vector fragment can also be prepared by the apparently simple PCR procedure using specific vector primers. However, caution must be taken as PCR of large fragments might introduce random polymerase errors into the vector part, which is not easy to control. In addition, PCR of larger fragments (>3 Kb) is not yet straightforward in terms of reproducibility and yield.

The "insert" DNA to be cloned is obtained by PCR, using specific primers with handles consisting of the vector regions A and B, respectively (FIG. 2). For genomic DNA, specific primers need to be synthesized, while for gene fragments inserted into vectors it is possible to use general PCR primers designed for the solid phase cloning. In both cases, one of the primers contains a biotin in the 5' end, which allows the in vitro amplified material to be captured by streptavidin coated magnetic beads. A single stranded insert fragment with flanking regions (A' and B') complementary to the vector fragment can subsequently be eluted with alkali. The two single stranded fragments can then be mixed to form a gap-duplex molecule (FIG. 2) and transformed directly into E. coli. The method allows cloning of any fragment, from any origin (chromosomal DNA, plasmid DNA etc.), independent of restriction sites. No restriction enzymes or ligase are needed and a very high yield of recombinants is expected, since the vector and insert alone should not give transformants.
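The handle principle described above can be illustrated with a minimal Python sketch. All sequences, function names and the chosen strand orientation (vector strand written 5'-A...B-3', with the biotinylated primer carrying the A handle) are hypothetical assumptions made for illustration; they are not the primers or vector used in this Example. The sketch only shows how 5' primer handles give an eluted insert strand whose ends are complementary to the ends of the single stranded vector.

```python
# Illustrative sketch only: sequences and helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(region_a: str, region_b: str, target: str, anneal_len: int = 8):
    """Build two PCR primers carrying 5' handles derived from vector regions A and B."""
    # Non-biotinylated primer: handle complementary to vector region B,
    # followed by the first bases of the target.
    elution_primer = revcomp(region_b) + target[:anneal_len]
    # Biotinylated primer (assumed): handle identical to vector region A,
    # followed by the reverse complement of the last bases of the target.
    capture_primer = region_a + revcomp(target[-anneal_len:])
    return elution_primer, capture_primer


def eluted_insert_strand(region_a: str, region_b: str, target: str) -> str:
    """Strand released by alkali: the one extended from the non-biotinylated primer."""
    return revcomp(region_b) + target + revcomp(region_a)


def forms_gap_duplex(vector_strand: str, insert_strand: str, overlap: int) -> bool:
    """True if both ends of the insert strand can pair with the vector strand ends."""
    return (revcomp(insert_strand[:overlap]) == vector_strand[-overlap:]
            and revcomp(insert_strand[-overlap:]) == vector_strand[:overlap])


# Hypothetical 10-base handles, vector body and target.
region_a, region_b = "GATTACAGGT", "CCTGAAGTCA"
vector = region_a + "TTTTGGGGCCCCAAAA" + region_b
target = "ATGGCCAAGCTTGGATCCGA"

insert = eluted_insert_strand(region_a, region_b, target)
print(forms_gap_duplex(vector, insert, 10))   # insert pairs with both vector ends
print(forms_gap_duplex(vector, vector, 10))   # vector alone cannot circularise
```

In this toy model the vector strand cannot pair with itself, which mirrors the expectation in the text that vector or insert alone should give no transformants.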

Design of primers for the human apolipoprotein E gene

The solid phase approach was tested by analysing the chromosomal gene fragments of the apolipoprotein E (ApoE) gene. Mature ApoE is a 299 amino acid protein which plays an important role in the lipoprotein metabolism (2). In humans, three major apoE isoproteins exist (3), apoE2, E3 and E4, encoded by the three different co-dominant alleles (E2, E3 and E4). Besides apoE2 (cys₁₁₂, cys₁₅₈), E3 (cys₁₁₂, arg₁₅₈) and E4 (arg₁₁₂, arg₁₅₈), several rare, independent apoE isoproteins have recently been described in this region (4). Sequencing of individual chromosomes is therefore of importance to establish the structure of the alleles in this region on both chromosomes.

The polymorphic region of apoE is due to single base substitutions at two loci, a C/T nucleotide change at codon 112 and a similar C/T nucleotide change at codon 158. Both mutations give rise to an arginine to cysteine replacement (FIG. 3). To test the solid phase cloning protocol (FIG. 2), a nested primer approach was followed with a pair of outer primers and a pair of inner primers (FIG. 3). The inner primers (RIT 123 and RIT 125) were used for the cloning and thus contain handle sequences complementary to the pUC vector. The downstream primer (RIT 123) contains a 5'-biotin to allow capture of the amplified chromosomal DNA.
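As a small worked illustration of how a single C/T change can replace arginine with cysteine, the following Python sketch substitutes the first base of an arginine codon. The choice of CGC as the arginine codon is an assumption made for the example (CGC is one of the arginine codons appearing in SEQ ID NO:6), and the two-entry codon table and the function name are hypothetical.

```python
# Minimal codon table covering only the two codons used in this example.
CODON_TO_AA = {"CGC": "Arg", "TGC": "Cys"}


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + base + codon[position + 1:]


arg_codon = "CGC"                            # assumed arginine codon
cys_codon = substitute(arg_codon, 0, "T")    # C -> T at the first codon position

print(arg_codon, CODON_TO_AA[arg_codon], "->", cys_codon, CODON_TO_AA[cys_codon])
```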

Direct solid phase cloning of the chromosomal apoE gene

Blood samples from several human patients were prepared (1) and used for a two-step (25 cycles each) PCR procedure using the outer primers (RIT 113 and RIT 114) and the inner primers (RIT 123 and RIT 125). For details see Materials and Methods. Analysis by agarose electrophoresis showed a band of the expected 290 bp for all samples (data not shown).

The PCR product for one of the patients was bound to the magnetic beads and the single stranded insert was eluted with alkali and neutralized. The vector fragment from pUC18 was prepared by restriction with EcoRI, followed by a fill-in reaction with biotin-dATP and by restriction with HindIII. After binding to magnetic beads, the single stranded vector was eluted with alkali and neutralized.

The single stranded vector and the eluted single stranded PCR fragment were mixed and used directly to transform competent E. coli cells. Several hundred colonies were obtained, while transformation with the vector or the insert alone gave few or no colonies, respectively (data not shown). Restriction mapping of purified plasmids from 20 colonies showed that 19 recombinants had a plasmid with an insert of the correct size (data not shown).

Sequencing of positive clones

Six of the colonies from the mixing experiment were sequenced directly by the solid phase method (5). Examples of two of the samples are shown in FIG. 4. The sequence data show that one of the clones (FIG. 4A) has a G/G in the polymorphic positions (arrows), corresponding to C/C at the two loci. In contrast, the other clone has an A/A (FIG. 4B) in these positions, suggesting a T/T genotype. Of the six clones sequenced, four were of the T-T genotype, while two were of the C-C type. Clearly, the patient is an E2/E4 heterozygote, with arg₁₁₂, arg₁₅₈ coded by one of the chromosomes, while the other chromosome codes for an apoE protein with cys₁₁₂, cys₁₅₈. An interesting observation is that four of the six clones sequenced showed different unique sequences outside the allelic codons, such as the T at position 127 in FIG. 4B. These nucleotide changes correspond to polymerase errors introduced during the repetitive PCR procedure. Two of the clones, such as the one shown in FIG. 4A, contained a sequence without any random errors.
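The allele assignment described above can be sketched in Python as follows. The sketch assumes, as the text indicates, that the sequencing read is of the strand complementary to the coding strand (so a read of G implies C and a read of A implies T), and that C and T at the polymorphic position correspond to arginine and cysteine respectively. The dictionary and function names are hypothetical, and the list of six reads simply restates the counts reported in the text.

```python
from collections import Counter

# Base pairing used to convert a read of the complementary strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Residue implied by the coding-strand base at each polymorphic position.
RESIDUE = {"C": "arg", "T": "cys"}
# Major isoproteins as (residue 112, residue 158), taken from the description.
ISOFORM = {("cys", "cys"): "E2", ("cys", "arg"): "E3", ("arg", "arg"): "E4"}


def call_clone(read_112: str, read_158: str):
    """Translate the two read bases of one clone into coding bases, residues and isoform."""
    coding = (COMPLEMENT[read_112], COMPLEMENT[read_158])
    residues = tuple(RESIDUE[base] for base in coding)
    return coding, residues, ISOFORM.get(residues, "other")


# Reads for the six sequenced clones: four A/A and two G/G, as reported.
clone_reads = [("A", "A")] * 4 + [("G", "G")] * 2
isoform_counts = Counter(call_clone(r112, r158)[2] for r112, r158 in clone_reads)
print(dict(isoform_counts))
```

Finding both E2 and E4 clones from a single patient sample is what supports the heterozygote assignment.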

DISCUSSION

We have for the first time shown that magnetic separation of DNA can be used for efficient assembly of recombinant DNA molecules. Here, individual human chromosomal gene fragments were directly cloned by PCR simply by using a 22 basepair primer "handle" complementary to the ends of the linear vector fragment during the PCR. The cloning step is simple, rapid and involves no ligation and restriction enzyme reactions. The yield of specific chromosomal clones was approximately 90% without using any positive selection or any special host strain.

The fragment produced can be cloned into different vectors provided that complementary overlap regions exist between the vector and the insert. This allows for the use of a battery of prepared single stranded vectors, into which the insert is directly cloned simply by mixing and transforming. This procedure is well suited for automation since no precipitations, extractions, filtrations or centrifugations are needed and no enzymatic steps are performed. Thus both manual and semi-automated procedures can be envisioned. This is significant for large scale projects, where it is desired to insert a gene fragment into many different vectors, such as various expression vectors.

The cloning protocol (FIG. 2) has the advantage that the same principal result can be accomplished either with or without using PCR to prepare the linearised vector. This is important since accumulated polymerase errors are a major concern whenever PCR products are cloned (6), which makes it highly desirable to sequence the cloned material. For large sized fragments such as cloning and expression vectors, this is difficult and time consuming. Therefore, solid phase cloning protocols that do not depend on PCR produced vector are attractive. In this Example, the vector was produced by the restriction-fill-in procedure to avoid PCR amplification of larger sized fragments, while the cloned chromosomal ApoE gene was obtained by PCR.

As expected, the cloned material has considerable amounts of randomly introduced errors (FIG. 4). As the error frequency of Taq polymerase is approximately 10⁻⁴ (2) and the PCR was carried out for 2×25 cycles, the theoretical error frequency for each nucleotide is roughly 1 out of 200 (10,000/50). This background (less than 1 per cent) is obviously not observed when a direct genomic sequencing is carried out. Even if only one template molecule is present in the original sample and an error is introduced in the first cycle, the correct signal at that position is theoretically 75% and can possibly be discriminated from background.
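The arithmetic in this paragraph can be reproduced with a short Python sketch. It follows the linear approximation used in the text (error rate per cycle multiplied by the number of cycles). The reading of the 75% figure, one erroneous strand among the four strands present after the first cycle of a single double stranded template, is an interpretation offered here as an assumption, and the function name is hypothetical.

```python
from fractions import Fraction

taq_error_rate = Fraction(1, 10_000)   # approx. 10^-4 errors per nucleotide per cycle
total_cycles = 2 * 25                  # two-step PCR, 25 cycles each

# Linear approximation used in the text: errors accumulate with each cycle.
per_nucleotide_error = taq_error_rate * total_cycles
print(per_nucleotide_error)            # 1/200


def correct_fraction_after_first_cycle_error(template_strands: int = 2) -> Fraction:
    """Fraction of strands still correct if one new strand is miscopied in cycle 1.

    Assumes a single double stranded template (2 strands) that doubles to
    4 strands after the first cycle, one of which carries the error, and
    faithful copying thereafter.
    """
    strands_after_cycle = 2 * template_strands
    return Fraction(strands_after_cycle - 1, strands_after_cycle)


print(correct_fraction_after_first_cycle_error())   # 3/4
```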

In contrast, when the same material is cloned, for example in E. coli, the random errors become prominent and readily detectable. For a fragment of the size of 200 basepairs (as the human apoE gene described here), an error frequency of 1/200 means that most fragments will contain an introduced error. The results of the cloning confirm this, as 4 out of 6 cloned fragments contained random errors. The high frequency of introduced errors detected in the fragments after cloning can of course be limited by performing fewer PCR cycles or by avoiding the nested primer approach. It might also be possible to use a less error prone polymerase. However, as long as relatively short fragments are cloned and a straightforward sequencing of several clones can be performed, it should be possible to find a clone with a correct sequence by a small scale screening. Note that the correct genomic sequence can be determined and defined by the direct genomic solid phase sequencing (5).
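The statement that most cloned fragments will carry an error can be checked with a simple probability estimate. The Python sketch below assumes that errors occur independently at each nucleotide with the per-nucleotide frequency of 1/200 given in the text; the function name is hypothetical. Under that assumption roughly 63% of 200 basepair fragments would contain at least one error, which is in line with the 4 out of 6 clones observed.

```python
def fraction_with_error(length_bp: int, per_nt_error: float) -> float:
    """Probability that a fragment of `length_bp` contains at least one error,
    assuming independent errors at each nucleotide."""
    return 1.0 - (1.0 - per_nt_error) ** length_bp


per_nt_error = 1 / 200      # theoretical per-nucleotide error frequency from the text
fragment_length = 200       # fragment size used in the text

expected_errors = fragment_length * per_nt_error
p_any_error = fraction_with_error(fragment_length, per_nt_error)
observed = 4 / 6            # clones with random errors in this Example

print(f"expected errors per fragment: {expected_errors:.2f}")
print(f"predicted fraction with an error: {p_any_error:.3f}")
print(f"observed fraction with an error: {observed:.3f}")
```

Halving the number of cycles in this model halves the per-nucleotide frequency, which is the reason fewer cycles or a non-nested protocol would reduce the fraction of erroneous clones.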

Interestingly, the cloning procedure followed by direct sequencing using an automated electrophoresis instrument can be used to separate and identify both alleles in a heterozygote. Thus, a diagnostic evaluation may be performed in an automated manner, which is in contrast to most polymorphism analyses based on hybridization (7). A direct clinical sequencing approach for diagnosis of polymorphism has the additional advantages that unexpected nucleotide changes in close proximity to the allele analyzed will be discovered and that the flanking sequences can be used as a positive control to show that the PCR reaction has been successful and specific.

In conclusion, the direct cloning procedure according to the invention was able to isolate and sequence individual human chromosomal apoE gene fragments. Thus, the relationship between two separated alleles could be resolved. The results demonstrate the selectivity and efficiency obtained by the solid phase approach, as all the recombinants sequenced had the desired chromosomal gene fragment. The cloning using magnetic separation is thus a highly efficient, rapid and simple tool to obtain recombinant molecules, although caution must be taken to minimize the effect of random errors introduced during the PCR by the Taq polymerase. This method and similar procedures can facilitate considerably the assembly of cloned genes in molecular biology and biotechnology.

References

1. Syvanen, A. -C., K. Aalto-Setala, K. Kontula and H. Soderlund. 1989. Direct sequencing of affinity-captured human DNA: application to the detection of apolipoprotein E polymorphism. FEBS Lett. 258:71-74.

2. Mahley, R. W. 1988. Apolipoprotein E: Cholesterol Transport Protein with Expanding Role in Cell Biology. Science 240:622-630.

3. Zannis, V. I., P. W. Just and J. L. Breslow. 1981. Human apolipoprotein E isoprotein subclasses are genetically determined. Am. J. Hum. Genet. 65232-236.

4. Paik, Y. -K., D. J. Chang, C. A. Reardon, G. E. Davies, R. W. Mahley and J. M. Taylor. 1985. Nucleotide sequence and structure of the human apolipoprotein E gene. Proc. Natl. Acad. Sci. USA 82:3445-3449.

5. Hultman, T., S. Stahl, E. Hornes and M. Uhlen. 1989. Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads and solid support. Nucleic Acids Res. 17:4937-4946.

6. Tindall, K. R. and T. A. Kunkel. 1988. Fidelity of DNA synthesis by the Thermus aquaticus DNA polymerase. Biochemistry 27:6008-6013.

7. Caskey, C. T. 1987. Disease Diagnosis by Recombinant DNA Methods. Science 236:1223-1229.

__________________________________________________________________________
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(iii) NUMBER OF SEQUENCES: 7

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid;
(A) DESCRIPTION: Synthetic DNA oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
TGCTTCCGGCTCGTATGTTGTGTG   24

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid;
(A) DESCRIPTION: Synthetic DNA oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
AAAGGGGGATGTGCTGCAAGGCGA   24

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid;
(A) DESCRIPTION: Synthetic DNA oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CGGCTCGTATGTGTGGAATTG   21

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid;
(A) DESCRIPTION: Synthetic DNA oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CCAATGCATATGTGGTCGGCTACGCTGGAAATAGCGCATATTTC   44

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CCAATGCATATGTGGTCGGCTACCGTGCTGGAAATAGCGCA   41

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 330 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(vi) ORIGINAL SOURCE:
(A) ORGANISM: HUMAN LIPOPROTEIN E GENE
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..330
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
AAG GAG TTG AAG GCC TAC AAA TCG GAA CTG GAG GAA CAA CTG ACC CCG   48
Lys Glu Leu Lys Ala Tyr Lys Ser Glu Leu Glu Glu Gln Leu Thr Pro
1               5                   10                  15
GTG GCG GAG GAG ACG CGG GCA CGG CTG TCC AAG GAG CTG CAG GCG GCG   96
Val Ala Glu Glu Thr Arg Ala Arg Leu Ser Lys Glu Leu Gln Ala Ala
            20                  25                  30
GAG GCC CCG CTG GGC GCG GAC ATG GAG GAC GTG CGC GGC CGC CTG GTG   144
Glu Ala Pro Leu Gly Ala Asp Met Glu Asp Val Arg Gly Arg Leu Val
        35                  40                  45
CAG TAC CGC GGC GAG GTG CAG GCC ATG CTC GGC CAG AGC ACC GAG GAG   192
Gln Tyr Arg Gly Glu Val Gln Ala Met Leu Gly Gln Ser Thr Glu Glu
    50                  55                  60
CTG CGG GTG CGC CTC GCC TCC CAC CTG CGC AAG CTG CGT AAG CGG CTC   240
Leu Arg Val Arg Leu Ala Ser His Leu Arg Lys Leu Arg Lys Arg Leu
65                  70                  75                  80
CTC CGC GAT GCC GAT GAC CTG CAG AAG CGC CTG GCA GTG TAC CAG GCC   288
Leu Arg Asp Ala Asp Asp Leu Gln Lys Arg Leu Ala Val Tyr Gln Ala
                85                  90                  95
GGG GCC CGC GAG GGC GCC GAG CGC GGC CTC AGC GCC ATC CGC   330
Gly Ala Arg Glu Gly Ala Glu Arg Gly Leu Ser Ala Ile Arg
            100                 105                 110

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 110 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Lys Glu Leu Lys Ala Tyr Lys Ser Glu Leu Glu Glu Gln Leu Thr Pro
1               5                   10                  15
Val Ala Glu Glu Thr Arg Ala Arg Leu Ser Lys Glu Leu Gln Ala Ala
            20                  25                  30
Glu Ala Pro Leu Gly Ala Asp Met Glu Asp Val Arg Gly Arg Leu Val
        35                  40                  45
Gln Tyr Arg Gly Glu Val Gln Ala Met Leu Gly Gln Ser Thr Glu Glu
    50                  55                  60
Leu Arg Val Arg Leu Ala Ser His Leu Arg Lys Leu Arg Lys Arg Leu
65                  70                  75                  80
Leu Arg Asp Ala Asp Asp Leu Gln Lys Arg Leu Ala Val Tyr Gln Ala
                85                  90                  95
Gly Ala Arg Glu Gly Ala Glu Arg Gly Leu Ser Ala Ile Arg
            100                 105                 110
__________________________________________________________________________

We claim:
 1. A method of cloning a target DNA, wherein (a) said target DNA is amplified by PCR to obtain single-stranded amplified target DNA, (b) the single-stranded, amplified target DNA is contacted with a single-stranded, linear vector DNA having terminal regions which are complementary to respective terminal regions of said amplified target DNA, wherein said complementary terminal regions overlap and hybridize to form a cyclic DNA product comprising single-stranded target DNA, single-stranded vector DNA, and two double-stranded regions formed by hybridization of said overlapping complementary terminal regions of the single stranded vector and the single stranded amplified target DNA, wherein said double-stranded regions are separated from each other by a region of single-stranded target DNA, and wherein said cyclic DNA product is essentially single-stranded apart from said double-stranded complementary regions, (c) said cyclic DNA product is introduced into a host organism, and (d) said cyclic DNA product is cloned by replication of said host organism.
 2. A method of cloning target DNA as claimed in claim 1, wherein the PCR is two-stage and uses nested primers, each primer of the second stage being complementary to a sequence of the target DNA between the sites complementary to the first stage primers.
 3. A method as claimed in claim 2, wherein said second stage primers comprise single-stranded 5' nucleotide extensions which form terminal regions in one strand of the amplified target DNA that are complementary to the terminal regions of said single stranded vector.
 4. A method as claimed in claim 1, wherein each of said double-stranded regions of the cyclic DNA product comprises at least one restriction site.
 5. A method as claimed in claim 1, wherein PCR is effected using two primers, one of which carries means for immobilization or is already immobilized.
 6. A method as claimed in claim 1, wherein site-specific mutagenesis is effected between the steps of PCR amplification and formation of the cyclic DNA product.
 7. A method as claimed in claim 6, wherein amplified target DNA is immobilized prior to said site-specific mutagenesis.
 8. A method as claimed in claim 1, further comprising the steps of recovering cloned DNA from said host organism having introduced therein said cyclic DNA and then determining in said cloned DNA a sequence in said target DNA.
 9. A method as claimed in claim 1, wherein the single stranded linear vector DNA is produced by a method comprising immobilizing only one end of only one strand of a double-stranded linear vector DNA and separating the non-immobilized strand from the immobilized strand wherein said non-immobilized strand is said single stranded linear vector.
 10. A kit for cloning target DNA comprising: (a) a linear vector (i) in single stranded form having terminal regions that are complementary to respective terminal regions of a target DNA, whereby said complementary terminal regions overlap and hybridize to form a cyclic DNA product comprising single stranded target DNA, single stranded vector DNA, and two double stranded DNA regions wherein said double stranded regions are separated from each other by a region of single stranded target DNA, and wherein said cyclic DNA product is essentially single stranded apart from said double stranded complementary regions; or (ii) in double stranded form and immobilized by one end of one strand thereof, wherein the single stranded linear vector is produced by separation of the non-immobilized strand from the immobilized strand; (b) a polymerase; (c) two PCR primers for amplification of the target DNA, wherein said PCR primers correspond to the terminal regions of said vector; (d) nucleoside triphosphates.
 11. A kit as claimed in claim 10 additionally comprising means for sequencing target DNA including either labelled primer or labelled nucleoside triphosphates.
 12. A kit as claimed in claim 10 additionally comprising means for site-specific mutagenesis including: (a) a site specific mutagenic amplification primer; (b) a second polymerase; and (c) a ligase.
 13. A method of cloning a target DNA according to claim 1, wherein said overlapping complementary regions are 20-250 base pairs in length.
 14. A method of cloning a target DNA according to claim 1, wherein said overlapping complementary regions are about 20 base pairs in length. 